# Featured Recommended AI Models

## Llama 3.3 70B Instruct 4bit DWQ
4-bit DWQ quantized version of the Llama 3.3 70B instruction-tuned model, optimized for efficient inference on the MLX framework.

- Tags: Large Language Model · Supports Multiple Languages
- Publisher: mlx-community · Downloads: 140 · Likes: 2
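MLX-quantized checkpoints like this one are typically run with the mlx-lm package. Below is a minimal sketch; the repo id mlx-community/Llama-3.3-70B-Instruct-4bit-DWQ is an assumption based on this listing, so check the publisher's page for the exact name.

```python
from mlx_lm import load, generate

# Assumed repo id; verify the exact name on the mlx-community page.
model, tokenizer = load("mlx-community/Llama-3.3-70B-Instruct-4bit-DWQ")

# Build a chat-formatted prompt and generate a short reply.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain 4-bit quantization in one paragraph."}],
    tokenize=False,
    add_generation_prompt=True,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```

Even at 4 bits, the 70B weights alone occupy roughly 35 GB, so this checkpoint is aimed at high-memory Apple silicon machines.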
## Qwen3 30B A3B Quantized.w4a16
INT4 quantized version of Qwen3-30B-A3B, reducing disk and GPU memory requirements by 75% while maintaining high performance.

- License: Apache-2.0
- Tags: Large Language Model · Transformers
- Publisher: RedHatAI · Downloads: 379 · Likes: 2
## Qwen3 32B FP8 Dynamic
An efficient language model based on Qwen3-32B with FP8 dynamic quantization, significantly reducing memory requirements and improving computational efficiency.

- License: Apache-2.0
- Tags: Large Language Model · Transformers
- Publisher: RedHatAI · Downloads: 917 · Likes: 8
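Both RedHatAI checkpoints above are typically served with vLLM, which picks up the quantization configuration from the checkpoint. A minimal sketch; the repo id RedHatAI/Qwen3-30B-A3B-quantized.w4a16 is an assumption based on this listing, and the FP8-dynamic Qwen3-32B model loads the same way.

```python
from vllm import LLM, SamplingParams

# Assumed repo id; the FP8-dynamic Qwen3-32B checkpoint loads the same way.
llm = LLM(model="RedHatAI/Qwen3-30B-A3B-quantized.w4a16")

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["List three benefits of INT4 weight quantization."], params)
print(outputs[0].outputs[0].text)
```

The same checkpoint can also be exposed as an OpenAI-compatible endpoint with `vllm serve <repo-id>`.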
## Llama 3.2 3B Instruct GGUF
Llama-3.2-3B-Instruct GGUF is a 3B-parameter large language model released by Meta, utilizing IQ-DynamicGate technology for ultra-low-bit quantization (1-2 bits), optimizing inference performance while maintaining memory efficiency.

- Tags: Large Language Model · Supports Multiple Languages
- Publisher: Mungert · Downloads: 656 · Likes: 3
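GGUF files target llama.cpp-compatible runtimes. A minimal sketch using llama-cpp-python; the repo id and file pattern are assumptions, and the repo typically offers several quantization levels to choose from.

```python
from llama_cpp import Llama

# Assumed repo id and filename pattern; pick the quantization level that fits your RAM.
llm = Llama.from_pretrained(
    repo_id="Mungert/Llama-3.2-3B-Instruct-GGUF",
    filename="*Q4_K_M*.gguf",  # glob pattern selecting one file from the repo
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What does ultra-low-bit quantization trade off?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```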
## Qwenphi 4 0.5b Draft
Built on Qwen2.5-0.5B-Instruct with the vocabulary transplanted from microsoft/phi-4, so it can be used as a draft model for Phi-4 in speculative (assisted) decoding.

- License: Apache-2.0
- Tags: Large Language Model · Transformers · Supports Multiple Languages
- Publisher: rdsm · Downloads: 27 · Likes: 4
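With transformers, a draft model like this plugs into assisted generation: the small model proposes tokens and the large target model verifies them. A minimal sketch; the rdsm/QwenPhi-4-0.5b-Draft repo id is an assumption based on this listing.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# microsoft/phi-4 is the target model; the draft repo id is assumed from the listing.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")
target = AutoModelForCausalLM.from_pretrained("microsoft/phi-4", torch_dtype="auto", device_map="auto")
draft = AutoModelForCausalLM.from_pretrained("rdsm/QwenPhi-4-0.5b-Draft", torch_dtype="auto", device_map="auto")

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain speculative decoding in two sentences."}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(target.device)

# assistant_model enables assisted (speculative) decoding; this only works because
# the draft shares Phi-4's vocabulary.
output = target.generate(inputs, assistant_model=draft, max_new_tokens=120)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```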
## Chocolatine 2 14B Instruct V2.0.3
Chocolatine-2-14B-Instruct-v2.0.3 is a large language model based on the Qwen-2.5-14B architecture and fine-tuned with DPO. It specializes in French and English tasks and ranks highly on the French LLM leaderboard.

- License: Apache-2.0
- Tags: Large Language Model · Transformers · Supports Multiple Languages
- Publisher: jpacifico · Downloads: 329 · Likes: 14
## Phi 3 Medium 128k Instruct
Phi-3-Medium-128K-Instruct is a lightweight open-source model with 14 billion parameters, focused on high quality and strong reasoning capabilities and supporting a 128K context length.

- License: MIT
- Tags: Large Language Model · Transformers · Other
- Publisher: microsoft · Downloads: 17.52k · Likes: 381
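Like the other Transformers-tagged entries, this model can be loaded with a standard text-generation pipeline. A brief sketch, assuming the repo id microsoft/Phi-3-medium-128k-instruct.

```python
from transformers import pipeline

# Assumed repo id; the 128K context window is limited in practice by available GPU memory.
generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3-medium-128k-instruct",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "What are the practical benefits of a 128K context window?"}]
result = generator(messages, max_new_tokens=200)
print(result[0]["generated_text"][-1]["content"])
```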